NUCLEIC ACIDS AND PROTEINS FROM STREPTOCOCCUS GROUPS A & B

All documents cited herein are incorporated by reference in their entirety.

TECHNICAL FIELD

This invention relates to nucleic acid and proteins from the bacteria Streptococcus agalactiae (GBS) and 
Streptococcus pyogenes (GAS). 

BACKGROUND ART 

Once thought to infect only cows, the Gram-positive bacterium Streptococcus agalactiae (or "group B
streptococcus", abbreviated to "GBS") is now known to cause serious disease, bacteremia and
meningitis, in immunocompromised individuals and in neonates. There are two types of neonatal
infection. The first (early onset, usually within 5 days of birth) is manifested by bacteremia and
pneumonia. It is contracted vertically as a baby passes through the birth canal. GBS colonises the vagina
of about 25% of young women, and approximately 1% of infants born via a vaginal birth to colonised
mothers will become infected. Mortality is between 50% and 70%. The second is a meningitis that occurs 10 to
60 days after birth. If pregnant women are vaccinated with type III capsule so that the infants are
passively immunised, the incidence of the late onset meningitis is reduced but is not entirely eliminated.

The "B" in "GBS" refers to the Lancefield classification, which is based on the antigenicity of a 
carbohydrate which is soluble in dilute acid and called the C carbohydrate. Lancefield identified 13 types 
of C carbohydrate, designated A to O, that could be serologically differentiated. The organisms that
most commonly infect humans are found in groups A, B, D, and G. Within group B, strains can be
divided into 8 serotypes (Ia, Ib, Ia/c, II, III, IV, V, and VI) based on the structure of their
polysaccharide capsule. 

Group A streptococcus ("GAS", S.pyogenes) is a frequent human pathogen, estimated to be present in 
between 5-15% of normal individuals without signs of disease. When host defences are compromised, 
or when the organism is able to exert its virulence, or when it is introduced to vulnerable tissues or hosts, 
however, an acute infection occurs. Diseases include puerperal fever, scarlet fever, erysipelas, 
pharyngitis, impetigo, necrotising fasciitis, myositis and streptococcal toxic shock syndrome. 

S.pyogenes is typically treated using antibiotics. Although S.agalactiae is also inhibited by antibiotics,
it is not killed by penicillin as easily as GAS. Prophylactic vaccination is thus preferable.

Current GBS vaccines are based on polysaccharide antigens, although these suffer from poor 
immunogenicity. Anti-idiotypic approaches have also been used (e.g. WO99/54457). There remains a
need, however, for effective adult vaccines against S.agalactiae infection. There also remains a need for 
vaccines against S.pyogenes infection. 

It is an object of the invention to provide proteins which can be used in the development of such 
vaccines. The proteins may also be useful for diagnostic purposes, and as targets for antibiotics.

DISCLOSURE OF THE INVENTION

The invention provides proteins comprising the S.agalactiae amino acid sequences disclosed in the 
examples, and proteins comprising the S.pyogenes amino acid sequences disclosed in the examples. 
These amino acid sequences are the even SEQ IDs between 1 and 10960. 

It also provides proteins comprising amino acid sequences having sequence identity to the S.agalactiae
amino acid sequences disclosed in the examples, and proteins comprising amino acid sequences having
sequence identity to the S.pyogenes amino acid sequences disclosed in the examples. Depending on the
particular sequence, the degree of sequence identity is preferably greater than 50% (e.g. 60%, 70%,
80%, 90%, 95%, 99% or more). These proteins include homologs, orthologs, allelic variants and
functional mutants. Typically, 50% identity or more between two proteins is considered to be an
indication of functional equivalence. Identity between proteins is preferably determined by the
Smith-Waterman homology search algorithm as implemented in the MPSRCH program (Oxford
Molecular), using an affine gap search with parameters gap open penalty=12 and gap extension
penalty=1.
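
By way of illustration only, a local alignment with affine gap penalties can be sketched as follows. This is not the MPSRCH implementation. The function name `smith_waterman_identity`, the match/mismatch scores (+5/-4, used in place of a substitution matrix, which the text does not specify), the convention that the first gapped position costs the opening penalty and each further position costs the extension penalty, and the definition of percent identity as identical positions divided by alignment columns are all assumptions made for this example.

```python
def smith_waterman_identity(a, b, match=5, mismatch=-4, gap_open=12, gap_extend=1):
    """Return (score, percent_identity) for the best local alignment of a and b.

    Scoring values other than gap_open=12 and gap_extend=1 are illustrative
    assumptions; a real search would use a substitution matrix.
    """
    neg = float("-inf")
    n, m = len(a), len(b)
    # H: best local score for an alignment ending at (i, j)
    # E: best score ending with a gap in `a` (consumes b[j-1])
    # F: best score ending with a gap in `b` (consumes a[i-1])
    H = [[0] * (m + 1) for _ in range(n + 1)]
    E = [[neg] * (m + 1) for _ in range(n + 1)]
    F = [[neg] * (m + 1) for _ in range(n + 1)]
    best, best_pos = 0, (0, 0)
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            E[i][j] = max(H[i][j - 1] - gap_open, E[i][j - 1] - gap_extend)
            F[i][j] = max(H[i - 1][j] - gap_open, F[i - 1][j] - gap_extend)
            s = match if a[i - 1] == b[j - 1] else mismatch
            H[i][j] = max(0, H[i - 1][j - 1] + s, E[i][j], F[i][j])
            if H[i][j] > best:
                best, best_pos = H[i][j], (i, j)

    # Trace back from the best cell, counting aligned columns and identities.
    i, j = best_pos
    state = "H"
    matches = columns = 0
    while i > 0 and j > 0:
        if state == "H":
            if H[i][j] == 0:
                break  # start of the local alignment
            s = match if a[i - 1] == b[j - 1] else mismatch
            if H[i][j] == H[i - 1][j - 1] + s:
                columns += 1
                matches += a[i - 1] == b[j - 1]
                i, j = i - 1, j - 1
            elif H[i][j] == E[i][j]:
                state = "E"
            else:
                state = "F"
        elif state == "E":
            columns += 1
            if E[i][j] == H[i][j - 1] - gap_open:
                state = "H"  # the gap was opened at this column
            j -= 1
        else:
            columns += 1
            if F[i][j] == H[i - 1][j] - gap_open:
                state = "H"
            i -= 1
    identity = 100.0 * matches / columns if columns else 0.0
    return best, identity
```

Under these assumptions, two sequences could be checked against the preferred threshold with a test such as `smith_waterman_identity(seq_a, seq_b)[1] > 50`.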

Preferred proteins of the invention are GBS1 to GBS689 (see Table IV).

The invention further provides proteins comprising fragments of the S.agalactiae amino acid sequences 
disclosed in the examples, and proteins comprising fragments of the S.pyogenes amino acid sequences 
disclosed in the examples. The fragments should comprise at least n consecutive amino acids from the 
sequences and, depending on the particular sequence, n is 7 or more (e.g. 8, 10, 12, 14, 16, 18, 20, 30, 
40, 50, 60, 70, 80, 90, 100 or more). Preferably the fragments comprise one or more epitopes from the
sequence. Other preferred fragments are (a) the N-terminal signal peptides of the proteins disclosed in 
the examples, (b) the proteins disclosed in the examples, but without their N-terminal signal peptides, (c) 
fragments common to the related GAS and GBS proteins disclosed in the examples, and (d) the proteins 
disclosed in the examples, but without their N-terminal amino acid residue. 
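
The fragment criteria above can be illustrated with a short sketch. The helper names (`has_fragment`, `signal_peptide`, `without_signal_peptide`, `without_n_terminal_residue`) and the example sequences are hypothetical, and the signal peptide length is assumed to be supplied by the user, since the text does not say how it is determined.

```python
def has_fragment(candidate: str, reference: str, n: int = 7) -> bool:
    """True if `candidate` contains at least n consecutive residues of `reference`."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Any shared run of n or more residues implies a shared n-mer.
    kmers = {reference[i:i + n] for i in range(len(reference) - n + 1)}
    return any(candidate[i:i + n] in kmers for i in range(len(candidate) - n + 1))


def signal_peptide(seq: str, signal_len: int) -> str:
    """Fragment (a): the N-terminal signal peptide, given its length."""
    return seq[:signal_len]


def without_signal_peptide(seq: str, signal_len: int) -> str:
    """Fragment (b): the protein without its N-terminal signal peptide."""
    return seq[signal_len:]


def without_n_terminal_residue(seq: str) -> str:
    """Fragment (d): the protein without its N-terminal amino acid residue."""
    return seq[1:]
```

For example, with a made-up reference `"MKKLLSTAVGIAL"`, the candidate `"GGSTAVGIAGG"` satisfies the criterion for n = 7 (it contains `STAVGIA`) but not for n = 8.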

The proteins of the invention can, of course, be prepared by various means (e.g. recombinant
expression, purification from GAS or GBS, chemical synthesis etc.) and in various forms (e.g. native,
fusions, glycosylated, non-glycosylated etc.). They are preferably prepared in substantially pure form
(i.e. substantially free from other streptococcal or host cell proteins) or substantially isolated form. 
Proteins of the invention are preferably streptococcal proteins. 

According to a further aspect, the invention provides antibodies which bind to these proteins. These
may be polyclonal or monoclonal and may be produced by any suitable means (e.g. by recombinant
expression). To increase compatibility with the human immune system, the antibodies may be chimeric
or humanised (e.g. Breedveld (2000) Lancet 355(9205):735-740; Gorman & Clark (1990) Semin.
Immunol. 2:457-466), or fully human antibodies may be used. The antibodies may include a detectable
label (e.g. for diagnostic assays).

According to a further aspect, the invention provides nucleic acid comprising the S.agalactiae
nucleotide sequences disclosed in the examples, and nucleic acid comprising the S.pyogenes nucleotide 
sequences disclosed in the examples. These nucleic acid sequences are the odd SEQ IDs between 1 and 
10966. 

In addition, the invention provides nucleic acid comprising nucleotide sequences having sequence
identity to the S.agalactiae nucleotide sequences disclosed in the examples, and nucleic acid comprising 
nucleotide sequences having sequence identity to the S.pyogenes nucleotide sequences disclosed in the 
examples. Identity between sequences is preferably determined by the Smith-Waterman homology 
search algorithm as described above. 

Furthermore, the invention provides nucleic acid which can hybridise to the S.agalactiae nucleic acid
disclosed in the examples, and nucleic acid which can hybridise to the S.pyogenes nucleic acid disclosed
in the examples, preferably under 'high stringency' conditions (e.g. 65°C in 0.1xSSC, 0.5% SDS
solution).

Nucleic acid comprising fragments of these sequences is also provided. These should comprise at least
n consecutive nucleotides from the S.agalactiae or S.pyogenes sequences and, depending on the
particular sequence, n is 10 or more (e.g. 12, 14, 15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 
200 or more). The fragments may comprise sequences which are common to the related GAS and GBS 
sequences disclosed in the examples. 

According to a further aspect, the invention provides nucleic acid encoding the proteins and protein 
fragments of the invention.

The invention also provides: nucleic acid comprising nucleotide sequence SEQ ID 10967; nucleic acid 
comprising nucleotide sequences having sequence identity to SEQ ID 10967; nucleic acid which can 
hybridise to SEQ ID 10967 (preferably under 'high stringency' conditions); nucleic acid comprising a 
fragment of at least n consecutive nucleotides from SEQ ID 10967, wherein n is 10 or more (e.g. 12, 14,
15, 18, 20, 25, 30, 35, 40, 50, 60, 70, 80, 90, 100, 150, 200, 250, 300, 350, 400, 450, 500, 600, 700, 800,
900, 1000, 1500, 2000, 3000, 4000, 5000, 10000, 100000, 1000000 or more).

Nucleic acids of the invention can be used in hybridisation reactions (e.g. Northern or Southern blots, or 
in nucleic acid microarrays or 'gene chips') and amplification reactions (e.g. PCR, SDA, SSSR, LCR, 
TMA, NASBA etc.) and other nucleic acid techniques. 

It should also be appreciated that the invention provides nucleic acid comprising sequences
complementary to those described above (e.g. for antisense or probing, or for use as primers). 

Nucleic acid according to the invention can, of course, be prepared in many ways (e.g. by chemical
synthesis, from genomic or cDNA libraries, from the organism itself etc.) and can take various forms
(e.g. single stranded, double stranded, vectors, primers, probes, labelled etc.). The nucleic acid is
preferably in substantially isolated form.

Nucleic acid according to the invention may be labelled e.g. with a radioactive or fluorescent label. This
is particularly useful where the nucleic acid is to be used in nucleic acid detection techniques e.g. where
the nucleic acid is a primer or a probe for use in techniques such as PCR, LCR, TMA, NASBA etc.

In addition, the term "nucleic acid" includes DNA and RNA, and also their analogues, such as those 
containing modified backbones, and also peptide nucleic acids (PNA) etc.

According to a further aspect, the invention provides vectors comprising nucleotide sequences of the 
invention (e.g. cloning or expression vectors) and host cells transformed with such vectors. 

According to a further aspect, the invention provides compositions comprising protein, antibody, and/or 
nucleic acid according to the invention. These compositions may be suitable as immunogenic 
compositions, for instance, or as diagnostic reagents, or as vaccines.

The invention also provides nucleic acid, protein, or antibody according to the invention for use as
medicaments (e.g. as immunogenic compositions or as vaccines) or as diagnostic reagents. It also
provides the use of nucleic acid, protein, or antibody according to the invention in the manufacture of: (i)
a medicament for treating or preventing disease and/or infection caused by streptococcus; (ii) a
diagnostic reagent for detecting the presence of streptococcus or of antibodies raised against
streptococcus; and/or (iii) a reagent which can raise antibodies against streptococcus. Said
streptococcus may be any species, group or strain, but is preferably S.agalactiae, especially serotype
III or V, or S.pyogenes. Said disease may be bacteremia, meningitis, puerperal fever, scarlet fever,
erysipelas, pharyngitis, impetigo, necrotising fasciitis, myositis or toxic shock syndrome.

The invention also provides a method of treating a patient, comprising administering to the patient a
therapeutically effective amount of nucleic acid, protein, and/or antibody of the invention. The patient
may either be at risk from the disease themselves or may be a pregnant woman ('maternal immunisation'
e.g. Glezen & Alpers (1999) Clin. Infect. Dis. 28:219-224).

Administration of protein antigens is a preferred method of treatment for inducing immunity. 

Administration of antibodies of the invention is another preferred method of treatment. This method of
passive immunisation is particularly useful for newborn children or for pregnant women. This method 
will typically use monoclonal antibodies, which will be humanised or fully human. 

The invention also provides a kit comprising primers (e.g. PCR primers) for amplifying a template
sequence contained within a Streptococcus (e.g. S.pyogenes or S.agalactiae) nucleic acid sequence, the
kit comprising a first primer and a second primer, wherein the first primer is substantially complementary
to said template sequence and the second primer is substantially complementary to a complement of said
template sequence, wherein the parts of said primers which have substantial complementarity define the
termini of the template sequence to be amplified. The first primer and/or the second primer may include
a detectable label (e.g. a fluorescent label).

The invention also provides a kit comprising first and second single-stranded oligonucleotides which
allow amplification of a Streptococcus template nucleic acid sequence contained in a single- or
double-stranded nucleic acid (or mixture thereof), wherein: (a) the first oligonucleotide comprises a primer
sequence which is substantially complementary to said template nucleic acid sequence; (b) the second
oligonucleotide comprises a primer sequence which is substantially complementary to the complement
of said template nucleic acid sequence; (c) the first oligonucleotide and/or the second oligonucleotide
comprise(s) sequence which is not complementary to said template nucleic acid; and (d) said primer
sequences define the termini of the template sequence to be amplified. The non-complementary
sequence(s) of feature (c) are preferably upstream of (i.e. 5' to) the primer sequences. One or both of
these (c) sequences may comprise a restriction site (e.g. EP-B-0509612) or a promoter sequence (e.g.
EP-B-0505012). The first oligonucleotide and/or the second oligonucleotide may include a detectable
label (e.g. a fluorescent label).
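
The relationship between the two primers and the termini of the amplified template can be sketched in silico. This is a simplified illustration, not a laboratory protocol: "substantially complementary" is reduced to exact complementarity, only one strand of the target is searched, and the function names and example sequences are hypothetical. Optional 5' tails model the non-complementary sequences of feature (c).

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def pcr_product(target, first_anneal, second_anneal, first_tail="", second_tail=""):
    """Predict the amplified strand (5'->3') from a single target strand.

    first_anneal  : primer part complementary to the target strand, so its
                    reverse complement appears in `target`
    second_anneal : primer part complementary to the complement of the target,
                    so it appears in `target` as written
    *_tail        : optional 5' sequences that are not complementary to the target
    Returns None if either primer site is absent (exact matching is an assumption).
    """
    start = target.find(second_anneal)
    if start == -1:
        return None
    site = reverse_complement(first_anneal)
    end = target.find(site, start + len(second_anneal))
    if end == -1:
        return None
    # The annealing parts of the primers define the termini of the template.
    template = target[start:end + len(site)]
    return second_tail + template + reverse_complement(first_tail)
```

With a made-up target `"TTTATGGCCAAAGGGTTTCCCTGAAAAA"`, the primers `"TCAGGG"` (first) and `"ATGGCC"` (second) delimit the template `"ATGGCCAAAGGGTTTCCCTGA"`. Adding 5' tails such as the EcoRI site `GAATTC` extends the product without changing the template termini.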

The template sequence may be any part of a genome sequence (e.g. SEQ ID 10967). For example, it 
could be a rRNA gene (e.g. Turenne et al. (2000) J. Clin. Microbiol. 38:513-520; SEQ IDs 12018-12024
herein) or a protein-coding gene. The template sequence is preferably specific to GBS.

The invention also provides a computer-readable medium (e.g. a floppy disk, a hard disk, a CD-ROM, a 
DVD etc.) and/or a computer database containing one or more of the sequences in the sequence listing. 
The medium preferably contains SEQ ID 10967.

The invention also provides a hybrid protein represented by the formula NH2-A-[-X-L-]n-B-COOH,
wherein X is a protein of the invention, L is an optional linker amino acid sequence, A is an optional
N-terminal amino acid sequence, B is an optional C-terminal amino acid sequence, and n is an integer
greater than 1. The value of n is between 2 and x, and the value of x is typically 3, 4, 5, 6, 7, 8, 9 or 10.
Preferably n is 2, 3 or 4; it is more preferably 2 or 3; most preferably, n = 2. For each n instances, -X-
may be the same or different. For each n instances of [-X-L-], linker amino acid sequence -L- may be
present or absent. For instance, when n=2 the hybrid may be NH2-X1-L1-X2-L2-COOH, NH2-X1-X2-COOH,
NH2-X1-L1-X2-COOH, NH2-X1-X2-L2-COOH, etc. Linker amino acid sequence(s) -L- will
typically be short (e.g. 20 or fewer amino acids i.e. 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8, 7, 6, 5, 4,
3, 2, 1). Examples include short peptide sequences which facilitate cloning, poly-glycine linkers (i.e. Gly_n
where n = 2, 3, 4, 5, 6, 7, 8, 9, 10 or more), and histidine tags (i.e. His_n where n = 3, 4, 5, 6, 7, 8, 9, 10
or more). Other suitable linker amino acid sequences will be apparent to those skilled in the art. -A- and
-B- are optional sequences which will typically be short (e.g. 40 or fewer amino acids i.e. 39, 38, 37, 36,
35, 34, 33, 32, 31, 30, 29, 28, 27, 26, 25, 24, 23, 22, 21, 20, 19, 18, 17, 16, 15, 14, 13, 12, 11, 10, 9, 8,
7, 6, 5, 4, 3, 2, 1). Examples include leader sequences to direct protein trafficking, or short peptide
sequences which facilitate cloning or purification (e.g. histidine tags i.e. His_n where n = 3, 4, 5, 6, 7, 8, 9,
10 or more). Other suitable N-terminal and C-terminal amino acid sequences will be apparent to those
skilled in the art. In some embodiments, each X will be a GBS sequence; in others, mixtures of GAS and
GBS will be used.
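
The formula NH2-A-[-X-L-]n-B-COOH can be read as a simple concatenation, as in the following sketch. The function name `hybrid_protein` and the short example sequences are hypothetical placeholders rather than sequences of the invention; the only constraint enforced is that n is greater than 1, since the length limits on -L-, -A- and -B- are described only as typical.

```python
def hybrid_protein(units, a="", b=""):
    """Assemble NH2-A-[-X-L-]n-B-COOH as a one-letter amino acid string.

    units : list of (X, L) pairs, one per repeat; L may be "" when the
            linker is absent for that repeat
    a, b  : optional N-terminal and C-terminal sequences
    """
    if len(units) < 2:
        raise ValueError("n must be an integer greater than 1")
    return a + "".join(x + linker for x, linker in units) + b


# n = 2, a poly-glycine linker after X1 only, and a C-terminal histidine tag
example = hybrid_protein([("MKT", "GGGG"), ("AYI", "")], b="HHHHHH")
```

Here `example` corresponds to the arrangement NH2-X1-L1-X2-B-COOH.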

According to further aspects, the invention provides various processes. 

A process for producing proteins of the invention is provided, comprising the step of culturing a host 
cell according to the invention under conditions which induce protein expression.

A process for producing protein or nucleic acid of the invention is provided, wherein the protein or 
nucleic acid is synthesised in part or in whole using chemical means. 

A process for detecting polynucleotides of the invention is provided, comprising the steps of: (a) 
contacting a nucleic acid probe according to the invention with a biological sample under hybridising
conditions to form duplexes; and (b) detecting said duplexes. 

A process for detecting Streptococcus in a biological sample (e.g. blood) is also provided, comprising 
the step of contacting nucleic acid according to the invention with the biological sample under 
hybridising conditions. The process may involve nucleic acid amplification (e.g. PCR, SDA, SSSR,
LCR, TMA, NASBA etc.) or hybridisation (e.g. microarrays, blots, hybridisation with a probe in
solution etc.). PCR detection of Streptococcus in clinical samples, in particular S.pyogenes, has been
reported [see e.g. Louie et al. (2000) CMAJ 163:301-309; Louie et al. (1998) J. Clin. Microbiol.
36:1769-1771]. Clinical assays based on nucleic acid are described in general in Tang et al. (1997) Clin.
Chem. 43:2021-2038.

A process for detecting proteins of the invention is provided, comprising the steps of: (a) contacting an 
antibody of the invention with a biological sample under conditions suitable for the formation of
antibody-antigen complexes; and (b) detecting said complexes.

A process for identifying an amino acid sequence is provided, comprising the step of searching for
putative open reading frames or protein-coding regions within a genome sequence of S.agalactiae. This
will typically involve in silico searching the sequence for an initiation codon and for an in-frame
termination codon in the downstream sequence. The region between these initiation and termination
codons is a putative protein-coding sequence. Typically, all six possible reading frames will be searched.
Suitable software for such analysis includes ORFFINDER (NCBI), GENEMARK [Borodovsky &
McIninch (1993) Computers Chem. 17:122-133], GLIMMER [Salzberg et al. (1998) Nucleic Acids Res.
26:544-548; Salzberg et al. (1999) Genomics 59:24-31; Delcher et al. (1999) Nucleic Acids Res.
27:4636-4641], or other software which uses Markov models [e.g. Shmatkov et al. (1999) Bioinformatics
15:874-876]. The invention also provides a protein comprising the identified amino acid sequence. These
proteins can then be expressed using conventional techniques.
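
The six-frame search described above can be sketched as a naive scan. This is not how ORFFINDER, GENEMARK or GLIMMER work internally (the latter two use statistical models); it merely illustrates looking for an initiation codon followed by an in-frame termination codon. Treating ATG as the only initiation codon, the `min_codons` cut-off, and the function names are assumptions for this example.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(genome: str, min_codons: int = 30):
    """Scan all six reading frames for ATG ... stop regions.

    Returns tuples (strand, frame, start, end, sequence); start and end are
    0-based, end-exclusive coordinates on the scanned strand, and the
    sequence includes the termination codon. min_codons counts codons
    before the stop (initiation codon included).
    """
    orfs = []
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            start = None
            for pos in range(frame, len(seq) - 2, 3):
                codon = seq[pos:pos + 3]
                if start is None and codon == "ATG":
                    start = pos  # first initiation codon in this frame
                elif start is not None and codon in STOP_CODONS:
                    if (pos - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, pos + 3, seq[start:pos + 3]))
                    start = None  # resume the search after the stop codon
    return orfs
```

A practical search of a bacterial genome would also need to consider alternative initiation codons and overlapping candidates, which this sketch deliberately omits.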

The invention also provides a process for determining whether a test compound binds to a protein of the
invention. If a test compound binds to a protein of the invention and this binding inhibits the life cycle of
the GBS bacterium, then the test compound can be used as an antibiotic or as a lead compound for the
design of antibiotics. The process will typically comprise the steps of contacting a test compound with a
protein of the invention, and determining whether the test compound binds to said protein. Preferred
proteins of the invention for use in these processes are enzymes (e.g. tRNA synthetases), membrane
transporters and ribosomal proteins. Suitable test compounds include proteins, polypeptides,
carbohydrates, lipids, nucleic acids (e.g. DNA, RNA, and modified forms thereof), as well as small
organic compounds (e.g. MW between 200 and 2000 Da). The test compounds may be provided
individually, but will typically be part of a library (e.g. a combinatorial library). Methods for detecting a
binding interaction include NMR, filter-binding assays, gel-retardation assays, displacement assays,
surface plasmon resonance, reverse two-hybrid etc. A compound which binds to a protein of the
invention can be tested for antibiotic activity by contacting the compound with GBS bacteria and then
monitoring for inhibition of growth. The invention also provides a compound identified using these
methods.

The invention also provides a composition comprising a protein of the invention and one or more of the
following antigens: 

- a protein antigen from Helicobacter pylori such as VacA, CagA, NAP, HopX, HopY [e.g.
WO98/04702] and/or urease.

- a protein antigen from N.meningitidis serogroup B, such as those in WO99/24578, WO99/36544,
WO99/57280, WO00/22430, Tettelin et al. (2000) Science 287:1809-1815, Pizza et al. (2000)
Science 287:1816-1820 and WO96/29412, with protein '287' and derivatives being particularly
preferred.

- an outer-membrane vesicle (OMV) preparation from N.meningitidis serogroup B, such as those
disclosed in WO01/52885; Bjune et al. (1991) Lancet 338(8775):1093-1096; Fukasawa et al. (1999)
Vaccine 17:2951-2958; Rosenqvist et al. (1998) Dev. Biol. Stand. 92:323-333 etc.

- a saccharide antigen from N.meningitidis serogroup A, C, W135 and/or Y, such as the
oligosaccharide disclosed in Costantino et al. (1992) Vaccine 10:691-698 from serogroup C [see
also Costantino et al. (1999) Vaccine 17:1251-1263].

- a saccharide antigen from Streptococcus pneumoniae [e.g. Watson (2000) Pediatr Infect Dis J
19:331-332; Rubin (2000) Pediatr Clin North Am 47:269-285, v; Jedrzejas (2001) Microbiol Mol
Biol Rev 65:187-207].

- an antigen from hepatitis A virus, such as inactivated virus [e.g. Bell (2000) Pediatr Infect Dis J
19:1187-1188; Iwarson (1995) APMIS 103:321-326].

- an antigen from hepatitis B virus, such as the surface and/or core antigens [e.g. Gerlich et al. (1990)
Vaccine 8 Suppl:S63-68 & 79-80].

- an antigen from hepatitis C virus [e.g. Hsu et al. (1999) Clin Liver Dis 3:901-915].

- an antigen from Bordetella pertussis, such as pertussis holotoxin (PT) and filamentous
haemagglutinin (FHA) from B.pertussis, optionally also in combination with pertactin and/or
agglutinogens 2 and 3 [e.g. Gustafsson et al. (1996) N. Engl. J. Med. 334:349-355; Rappuoli et al.
(1991) TIBTECH 9:232-238].

- a diphtheria antigen, such as a diphtheria toxoid [e.g. chapter 3 of Vaccines (1988) eds. Plotkin &
Mortimer. ISBN 0-7216-1946-0] e.g. the CRM197 mutant [e.g. Del Guidice et al. (1998) Molecular
Aspects of Medicine 19:1-70].

- a tetanus antigen, such as a tetanus toxoid [e.g. chapter 4 of Plotkin & Mortimer].

- a saccharide antigen from Haemophilus influenzae B.

- an antigen from N.gonorrhoeae [e.g. WO99/24578, WO99/36544, WO99/57280].

- an antigen from Chlamydia pneumoniae [e.g. PCT/IB01/01445; Kalman et al. (1999) Nature
Genetics 21:385-389; Read et al. (2000) Nucleic Acids Res 28:1397-406; Shirai et al. (2000) J.
Infect. Dis. 181(Suppl 3):S524-S527; WO99/27105; WO00/27994; WO00/37494].

- an antigen from Chlamydia trachomatis [e.g. WO99/28475].

- an antigen from Porphyromonas gingivalis [e.g. Ross et al. (2001) Vaccine 19:4135-4142].

- polio antigen(s) [e.g. Sutter et al. (2000) Pediatr Clin North Am 47:287-308; Zimmerman & Spann
(1999) Am Fam Physician 59:113-118, 125-126] such as IPV or OPV.

- rabies antigen(s) [e.g. Dreesen (1997) Vaccine 15 Suppl:S2-6] such as lyophilised inactivated virus
[e.g. MMWR Morb Mortal Wkly Rep 1998 Jan 16;47(1):12, 19; RabAvert™].

- measles, mumps and/or rubella antigens [e.g. chapters 9, 10 & 11 of Plotkin & Mortimer].

- influenza antigen(s) [e.g. chapter 19 of Plotkin & Mortimer], such as the haemagglutinin and/or
neuraminidase surface proteins.

- an antigen from Moraxella catarrhalis [e.g. McMichael (2000) Vaccine 19 Suppl 1:S101-107].

- an antigen from Staphylococcus aureus [e.g. Kuroda et al. (2001) Lancet 357(9264):1225-1240;
see also pages 1218-1219].

Where a saccharide or carbohydrate antigen is included, it is preferably conjugated to a carrier protein in
order to enhance immunogenicity [e.g. Ramsay et al. (2001) Lancet 357(9251):195-196; Lindberg (1999)
Vaccine 17 Suppl 2:S28-36; Conjugate Vaccines (eds. Cruse et al.) ISBN 3805549326, particularly vol.
10:48-114 etc.]. Preferred carrier proteins are bacterial toxins or toxoids, such as diphtheria or tetanus
toxoids. The CRM197 diphtheria toxoid is particularly preferred. Other suitable carrier proteins include
the N.meningitidis outer membrane protein [e.g. EP-0372501], synthetic peptides [e.g. EP-0378881,
EP-0427347], heat shock proteins [e.g. WO93/17712], pertussis proteins [e.g. WO98/58668; EP-0471177],
protein D from H.influenzae [e.g. WO00/56360], toxin A or B from C.difficile [e.g. WO00/61761], etc.
Any suitable conjugation reaction can be used, with any suitable linker where necessary.

Toxic protein antigens may be detoxified where necessary (e.g. detoxification of pertussis toxin by 
chemical and/or genetic means).

Where a diphtheria antigen is included in the composition it is preferred also to include tetanus antigen
and pertussis antigens. Similarly, where a tetanus antigen is included it is preferred also to include 
diphtheria and pertussis antigens. Similarly, where a pertussis antigen is included it is preferred also to 
include diphtheria and tetanus antigens. 

Antigens are preferably adsorbed to an aluminium salt.

Antigens in the composition will typically be present at a concentration of at least 1 µg/ml each. In
general, the concentration of any given antigen will be sufficient to elicit an immune response against that
antigen.

The invention also provides compositions comprising two or more proteins of the present invention.
The two or more proteins may comprise GBS sequences or may comprise GAS and GBS sequences.

A summary of standard techniques and procedures which may be employed to perform the invention 
(e.g. to utilise the disclosed sequences for vaccination or diagnostic purposes) follows. This summary is 
not a limitation on the invention but, rather, gives examples that may be used, but are not required. 

General 

The practice of the present invention will employ, unless otherwise indicated, conventional techniques of
molecular biology, microbiology, recombinant DNA, and immunology, which are within the skill of the
art. Such techniques are explained fully in the literature e.g. Sambrook Molecular Cloning: A Laboratory
Manual, Second Edition (1989); DNA Cloning, Volumes I and II (D.N. Glover ed. 1985);
Oligonucleotide Synthesis (M.J. Gait ed. 1984); Nucleic Acid Hybridization (B.D. Hames & S.J.
Higgins eds. 1984); Transcription and Translation (B.D. Hames & S.J. Higgins eds. 1984); Animal
Cell Culture (R.I. Freshney ed. 1986); Immobilized Cells and Enzymes (IRL Press, 1986); B. Perbal, A
Practical Guide to Molecular Cloning (1984); the Methods in Enzymology series (Academic Press,
Inc.), especially volumes 154 & 155; Gene Transfer Vectors for Mammalian Cells (J.H. Miller and M.P.
Calos eds. 1987, Cold Spring Harbor Laboratory); Mayer and Walker, eds. (1987), Immunochemical
Methods in Cell and Molecular Biology (Academic Press, London); Scopes, (1987) Protein
Purification: Principles and Practice, Second Edition (Springer-Verlag, N.Y.), and Handbook of
Experimental Immunology, Volumes I-IV (D.M. Weir and C.C. Blackwell eds 1986).

Standard abbreviations for nucleotides and amino acids are used in this specification.

Definitions

A composition containing X is "substantially free of" Y when at least 85% by weight of the total X+Y in the composition is X.
Preferably, X comprises at least about 90% by weight of the total of X+Y in the composition, more preferably at least about 95%
or even 99% by weight.
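
As a worked illustration of this definition, the weight criterion can be expressed as a one-line check. The function name `substantially_free` and the example masses are hypothetical; the default threshold of 0.85 comes from the definition, and the preferred levels (0.90, 0.95, 0.99) can be passed explicitly.

```python
def substantially_free(mass_x: float, mass_y: float, threshold: float = 0.85) -> bool:
    """True if X makes up at least `threshold` of the total X+Y by weight."""
    total = mass_x + mass_y
    if total <= 0:
        raise ValueError("total mass of X and Y must be positive")
    return mass_x / total >= threshold
```

For example, 9.0 mg of X with 1.0 mg of Y gives 90% X by weight, which meets the 85% criterion but not the more preferred 95% level.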

The term "comprising" means "including" as well as "consisting" e.g. a composition "comprising" X may consist exclusively of
X or may include something additional e.g. X+Y.

The term "heterologous" refers to two biological components that are not found together in nature. The components may be host
cells, genes, or regulatory regions, such as promoters. Although the heterologous components are not found together in nature,
they can function together, as when a promoter heterologous to a gene is operably linked to the gene. Another example is where a
streptococcus sequence is heterologous to a mouse host cell. A further example would be two epitopes from the same or
different proteins which have been assembled in a single protein in an arrangement not found in nature.